My name is Gilbert Permalloo. I am a Research Project Officer, presently working on root architecture and the rhizosphere of wheat. I worked in sugarcane agronomy for about 24 years, and about 30 years ago I did a little basic programming in Fortran 77 and GW-BASIC. Most of my data manipulation and visualisation has been done in Excel. I could not write any code in R before I joined Data School, and I was spending a lot of time working with data in spreadsheets. Now I am amazed every day by what R can do for data manipulation and visualisation.
The aim of this project is to investigate the use of portable X-ray fluorescence spectroscopy (pXRF) as a rapid method to quantify the amount of phosphorus accumulated in straw and grains. Photo-1 shows the pXRF instrument used. About 200 grab samples were taken from one of three phosphorus trials (with 0 kg and 30 kg P per hectare as treatments) for this study. The straw and grains were ground and then scanned by the pXRF. The pXRF generated two large datasets: the chemistry dataset contains the concentrations (in ppm) of a wide range of chemical elements, whereas the beamspectra dataset contains spectral values from three X-ray beams. R was used to clean, tidy and re-organise the data, as well as to visualise it. The phosphorus data were extracted from the large pXRF-generated dataset and merged with a dataframe of unique identification numbers (SampleID) that link the data to the sample source (STEM_ID) and to other tables containing agronomic data for each sample.
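The extract-and-merge step described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the two small data frames, their values and the element columns other than phosphorus are made up, and only the names SampleID, STEM_ID and SUBSAMPLE echo the real data.

```r
library(dplyr)

# Hypothetical pXRF chemistry output: one row per scan, one column per element (ppm).
# All values are invented for illustration.
chemistry <- data.frame(
  SampleID = c(101, 102, 103, 104),
  P        = c(400, 2700, 650, 4600),
  K        = c(15000, 4000, 17000, 4500),
  Ca       = c(2000, 350, 2300, 400)
)

# Hypothetical lookup table linking each scan to its sample source
sample_key <- data.frame(
  SampleID  = c(101, 102, 103, 104),
  STEM_ID   = c("STEM_A", "STEM_A", "STEM_B", "STEM_B"),
  SUBSAMPLE = c("Straw", "Grain", "Straw", "Grain"),
  stringsAsFactors = FALSE
)

# Keep only the phosphorus column, then attach the sample identifiers
phosphorus <- chemistry %>%
  select(SampleID, P) %>%
  left_join(sample_key, by = "SampleID")

print(phosphorus)
```

A left join keeps every scan even when the lookup table has no matching SampleID, so missing links show up as NA instead of silently disappearing.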
Figures 1 and 2 show a relatively higher amount of phosphorus detected in the grains than in the straw. The correlation between straw and grain phosphorus is slightly higher under the 30 kg/ha P treatment than under the 0 kg/ha P treatment. Figure 2 shows that the level of P detected in the straw does not correlate with grain yield. However, the amount of phosphorus detected in the grains correlates differently with yield under the two treatments: grains produced under the low treatment tend to show a slight positive correlation with yield, whereas grains from the 30 kg/ha treatment show a negative correlation with yield.
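The correlations discussed above can also be put into numbers. The sketch below shows one possible way to compute a Pearson correlation per treatment with dplyr. The column names follow the cleaned data set, but the values are made up purely for illustration and are not results from the trial.

```r
library(dplyr)

# Made-up values for illustration only; they are not results from the trial
example_p <- data.frame(
  PAP.x       = rep(c("0 kg P", "30 kg P"), each = 4),
  straw_pconc = c(300, 400, 500, 600, 350, 450, 550, 650),
  grain_pconc = c(2500, 3100, 3000, 3900, 3000, 3600, 4100, 4700),
  stringsAsFactors = FALSE
)

# Pearson correlation between straw and grain P, computed separately per treatment
cor_by_treatment <- example_p %>%
  group_by(PAP.x) %>%
  summarise(r_straw_grain = cor(straw_pconc, grain_pconc))

print(cor_by_treatment)
```

Swapping `straw_pconc` for `yield` in the `cor()` call would give the per-treatment correlation with yield in the same way.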
library(tidyverse)
library(kableExtra)

# Read the cleaned straw and grain phosphorus data
data_straw_grain <- read_csv("clean_data/straw_grain_p.csv")

# Keep the columns shown in the table and give them readable names
data_straw_grain_p <- data_straw_grain %>%
  select(STEM_ID, GENOTYPE, SUBSAMPLE.x, straw_pconc, PAP.x, grain_pconc, yield) %>%
  rename(`P in straw` = straw_pconc, `P in grain` = grain_pconc, `P level (kg/ha)` = PAP.x)

# Show the first five rows as a striped HTML table
knitr::kable(head(data_straw_grain_p, n = 5),
             format = "html",
             caption = "Amount of phosphorus (ppm) detected in straw and grains data") %>%
  kable_styling("striped")

| STEM_ID | GENOTYPE | SUBSAMPLE.x | P in straw | P level (kg/ha) | P in grain | yield |
|---|---|---|---|---|---|---|
| ST50PKT0WD9S | CAV4081442 | Straw | 402 | 0 kg P | 2699 | 1.802632 |
| ST50PKT0WCYG | CAV4080777 | Straw | 433 | 0 kg P | 3587 | 2.854010 |
| ST50PKT0WD1B | CAV4081233 | Straw | 659 | 30 kg P | 4627 | 3.201220 |
| ST50PKT0WCD2 | CAV4080976 | Straw | 334 | 0 kg P | 3843 | 2.881579 |
| ST50PKT0WC08 | CAV4081051 | Straw | 430 | 30 kg P | 4440 | 3.059210 |
Photo-1: pXRF instrument used to quantify amount of phosphorus in straw and grains
# Read the cleaned data and plot straw P against grain P, coloured by P treatment
straw_grain_p <- read_csv("clean_data/straw_grain_p.csv")

straw_grain <- ggplot(data = straw_grain_p,
                      mapping = aes(x = grain_pconc,
                                    y = straw_pconc,
                                    colour = PAP.x)) +
  geom_point(alpha = 0.2) +
  # One linear trend line per treatment, without confidence bands
  geom_smooth(method = "lm", size = 0.5, se = FALSE)

straw_grain +
  labs(x = "Grains",
       y = "Straw")

Figure 1: Amount of phosphorus (ppm) in straw vs in grains
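# Read the cleaned pXRF data and plot P concentration against grain yield,
# coloured by P treatment and shaped by sample type (straw or grain)
good_data <- read_csv("clean_data/good_data_york_pxrf.csv")

yield_Pconc <- ggplot(data = good_data,
                      mapping = aes(x = yield,
                                    y = `P Concentration`,
                                    colour = PAP,
                                    shape = SUBSAMPLE)) +
  geom_point() +
  # One linear trend line per treatment and sample type, without confidence bands
  geom_smooth(method = "lm", size = 0.5, se = FALSE)

yield_Pconc +
  labs(x = "Yield (kg)",
       y = "P Concentration (ppm)")

Figure 2: Amount of phosphorus in straw and grains vs grain yield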
For this project, I used R version 3.6.1 with the packages tidyverse, ggplot2, kableExtra, imager, data.table, readxl and lubridate.
Most of my time went into tidying and cleaning the data. That is when I realised how crucial it is to understand what data are to be collected and how, as well as their structure and formatting, not to forget how and where they are stored. I wrote code in R to resolve these issues and bring the data together in one clean data set that can be reused by anyone at any time in the future.
I will continue to use R for data manipulation so that I improve my skills and build up my trust in good, reusable data. I am looking forward to using R at every stage, from preparing trial designs through data manipulation, visualisation and analysis to publishing.
My Data School experience has been challenging, exciting and, most of all, very enriching. One of the challenges is making the mind-shift to learn and adopt a new platform for manipulating data safely, explicitly and repeatably, far removed from the risky habit of working in spreadsheets.